Search CORE

28 research outputs found

Extracting adverse drug reactions and their context using sequence labelling ensembles in TAC2017

Author: Belousov Maksim
Dixon William
Milosevic Nikola
Nenadic Goran
Publication date: 01/01/2018

Adverse drug reactions (ADRs) are unwanted or harmful effects experienced after the administration of a certain drug or a combination of drugs, presenting a challenge for drug development and drug administration. In this paper, we present a set of taggers for extracting adverse drug reactions and related entities, including factors, severity, negations, drug class and animal. The systems used a mix of rule-based, machine learning (CRF) and deep learning (BLSTM with word2vec embeddings) methodologies in order to annotate the data. The systems were submitted to the adverse drug reaction shared task, organised during the Text Analysis Conference in 2017 by the National Institute of Standards and Technology, achieving F1-scores of 76.00 and 75.61 respectively. Comment: Paper describing submission for TAC ADR shared tas

arXiv.org e-Print Archive

The University of Manchester - Institutional Repository

GNTeam at 2018 n2c2: Feature-augmented BiLSTM-CRF for drug-related entity recognition in hospital discharge summaries

Author: Alfattni Ghada
Alrdahi Haifa
Belousov Maksim
Milosevic Nikola
Nenadic Goran
Publication date: 23/09/2019

Monitoring the administration of drugs and adverse drug reactions are key parts of pharmacovigilance. In this paper, we explore the extraction of drug mentions and drug-related information (reason for taking a drug, route, frequency, dosage, strength, form, duration, and adverse events) from hospital discharge summaries through deep learning that relies on various representations for clinical named entity recognition. This work was officially part of the 2018 n2c2 shared task, and we use the data supplied as part of the task. We developed two deep learning architectures based on recurrent neural networks and pre-trained language models. We also explore the effect of augmenting word representations with semantic features for clinical named entity recognition. Our feature-augmented BiLSTM-CRF model achieved an F1-score of 92.67% and ranked 4th in the entity extraction sub-task among the systems submitted to the n2c2 challenge. The recurrent neural networks that use pre-trained domain-specific word embeddings and a CRF layer for label optimization extract drugs, adverse events and related entities with a micro-averaged F1-score of over 91%. The augmentation of word vectors with semantic features extracted using available clinical NLP toolkits can further improve the performance. Word embeddings that are pre-trained on a large unannotated corpus of relevant documents and further fine-tuned to the task perform rather well. However, the augmentation of word embeddings with semantic features can help improve the performance (primarily by boosting precision) of drug-related named entity recognition from electronic health records.

arXiv.org e-Print Archive

The University of Manchester - Institutional Repository

Point mutations affecting yeast prion propagation change the structure of its amyloid fibrils

Author: Belousov Mikhail V.
Bondarev Stanislav A.
Kajava Andrey V.
Kuznetsova Irina M.
Llanos Manuel Augusto
Sulatskaya Anna I.
Sulatsky Maksim I.
Trubitsina Nina P.
Turoverov Konstantin K.
Zhouravleva Galina A.
Publication date: 16/11/2021

We investigated the effect of point substitutions in the N-terminal domain of the yeast prion protein Sup35 (Sup35NMp) on the structure of its amyloid fibrils. For the study, we chose proteins with mutations that influence [PSI+] prion propagation differently but do not prevent the aggregation of Sup35NMp in vitro. The use of a wide range of physico-chemical methods allowed us to show significant differences in the structure of these aggregates, their physical size, and their clumping tendency. We also demonstrated that the fluorescent probe thioflavin T (ThT) can be successfully used to investigate subtle changes in the structural organization of fibrils formed from various Sup35NMp. The obtained results and our theoretical predictions allowed us to conclude that some of the selected amino acid substitutions delimit the region of the protein that forms the core of amyloid fibrils, and change the fibril structure. The relationship between the structural features of in vitro Sup35NMp amyloid aggregates and the stability of the [PSI+] prion in vivo allowed us to suggest that oligopeptide repeats (R) of the amyloidogenic N-terminal domain of Sup35NMp from R0 to R2 play a key role in protein aggregation. Their arrangement, rather than just their presence, is critical for propagation of the strong [PSI+] prion variants. The results confirm the suitability of the proposed combination of theoretical and empirical approaches for identifying changes in amyloid fibril structure, which, in turn, can significantly affect both the functional stability of amyloid fibrils and their pathogenicity. Laboratorio de Investigación y Desarrollo de Bioactivo

Servicio de Difusión de la Creación Intelectual

Data and systems for medication-related text classification and concept normalization from Twitter: insights from the Social Media Mining for Health (SMM4H)-2017 shared task

Author: Abeed Sarker
Anthony Rios
Berry de Bruijn
Debanjan Mahata
Farrokh Mehryary
Filip Ginter
Goran Nenadic
Graciela Gonzalez-Hernandez
Jasper Friedrichs
Kai Hakala
Maksim Belousov
Ramakanth Kavuluru
Saif M. Mohammad
Sifei Han
Svetlana Kiritchenko
Tung Tran
Publication venue: 'Oxford University Press (OUP)'
Publication date: 28/10/2022

Objective: We executed the Social Media Mining for Health (SMM4H) 2017 shared tasks to enable the community-driven development and large-scale evaluation of automatic text processing methods for the classification and normalization of health-related text from social media. An additional objective was to publicly release manually annotated data. Materials and Methods: We organized 3 independent subtasks: automatic classification of self-reports of 1) adverse drug reactions (ADRs) and 2) medication consumption, from medication-mentioning tweets, and 3) normalization of ADR expressions. Training data consisted of 15 717 annotated tweets for (1), 10 260 for (2), and 6650 ADR phrases and identifiers for (3); and exhibited typical properties of social-media-based health-related texts. Systems were evaluated using 9961, 7513, and 2500 instances for the 3 subtasks, respectively. We evaluated performances of classes of methods and ensembles of system combinations following the shared tasks. Results: Among 55 system runs, the best system scores for the 3 subtasks were 0.435 (ADR class F1-score) for subtask-1, 0.693 (micro-averaged F1-score over two classes) for subtask-2, and 88.5% (accuracy) for subtask-3. Ensembles of system combinations obtained best scores of 0.476, 0.702, and 88.7%, outperforming individual systems. Discussion: Among individual systems, support vector machines and convolutional neural networks showed high performance. Performance gains achieved by ensembles of system combinations suggest that such strategies may be suitable for operational systems relying on difficult text classification tasks (eg, subtask-1). Conclusions: Data imbalance and lack of context remain challenges for natural language processing of social media text. Annotated data from the shared task have been made available as reference standards for future studies (http://dx.doi.org/10.17632/rxwfb3tysd.1).

UTUPub

Learning explainable representations of concepts in specialised languages: experiments in healthcare social media

Author: Belousov Maksim
Publication date: 01/08/2020

The University of Manchester - Institutional Repository

Extracting adverse drug reactions and their context using sequence labelling ensembles

Author: Belousov Maksim
Dixon William
Milosevic Nikola
Nenadic Goran
Publication date: 01/04/2018

The University of Manchester - Institutional Repository

Extracting adverse drug reactions and their context using sequence labelling ensembles in TAC2017

Author: Belousov Maksim
Dixon William
Milosevic Nikola
Nenadic Goran
Publication date: 01/01/2018

The University of Manchester - Institutional Repository

Extracting Drug Names and Associated Attributes From Discharge Summaries: Text Mining Study

Author: Alfattni Ghada
Belousov Maksim
Nenadic Goran
Peek Niels
Publication venue: 'JMIR Publications Inc.'
Publication date: 05/05/2021

The University of Manchester - Institutional Repository

Using Twitter to mine sleep related information from people who declare a diagnosis of a psychotic disorder

Author: Goran Nenadic
Maksim Belousov
Mladen Dinev
Natalie Berry
Rohan Morris
Publication venue: 'Swansea University'
Publication date: 01/04/2017

Objectives: Our group has investigated the occurrence of psychotic(-like) experiences (PLEs) in Twitter posts, namely auditory hallucinations. Tweets classified as potentially related to auditory hallucinations were proportionately higher between 23:00 and 5:00 in comparison to tweets not classified. This may indicate a clinically significant relationship between sleep and PLEs in the general population, a notion supported by the literature. Based on our previous investigation, the current study aimed to explore whether this methodology could be amended to generate datasets regarding sleep experiences in people who self-report a diagnosis of a psychotic disorder. Approach: The current investigation seeks to establish if it is feasible to generate anonymised datasets regarding sleep by extracting information from the timelines of people who self-report a psychotic diagnosis. A text mining method was implemented that utilised rule-based semantic filters that aimed to identify self-reported diagnoses. This focused on occurrences of personal and possessive pronouns to detect the subjectivity of tweets, as well as potential diagnostic verb indicators and any mentions of other related factors. For each diagnostic tweet, we collected information from user timelines. A sleep-related classifier was then implemented, which used lexical features (e.g. bag-of-words, part-of-speech tags) to predict whether a given tweet refers to sleep-related experience. Results: After training the classifier on the bag-of-words model, the most informative words which contributed to the performance of the classifier were: ‘sleep’, ‘can’t awake’, ‘never’, ‘stress’. Part-of-speech tags (e.g. verbs, adverbs) were also important features. The classification accuracy of the ‘bag-of-words’ model was better than the ‘part-of-speech’ model. Through the method outlined herein, we were able to improve the quality of the generated datasets in comparison to the previous investigation. This methodology also facilitated the mining of the timelines of individual Twitter users who stated a personal diagnosis. To this end, an additional filter was implemented to identify tweets regarding sleep experience. The potential relationship between sentiment and temporality expressed in diagnosis and sleep experiences is also discussed. Conclusions: The results from this study have implications for mental health research on Twitter. Specifically, the refinements in the methodology enabled retrieval of two high quality datasets regarding psychosis and sleep. Therefore it is feasible that other psychosis-related phenomena (e.g. visual hallucinations, delusions, medication) could also be applied as separate filters to create one dataset of psychosis-related experiences within individuals diagnosed with psychosis.

Directory of Open Access Journals

Patient discussions of glucocorticoid-related side effects within an online health community forum

Author: Belousov Maksim
Dixon William
Hassan Lamiece
Nenadic Goran
Vivekanantham Arani
Publication date: 10/02/2030

The University of Manchester - Institutional Repository